The QSAT problem is the quantified version of the SAT problem.
We show the existence of a threshold effect for the phase transition
associated with the satisfiability of random quantified extended
2-CNF formulas. We consider boolean CNF formulas of the form
$\forall X \exists Y \varphi(X, Y)$, where X has m variables, Y has n variables and each
clause in $\varphi$ has one literal from X and two from Y. For such formulas,
we show that the threshold phenomenon is controlled by the
ratio between the number of clauses and the number n of existential
variables. Then we give the exact location of the associated critical
ratio c*. Indeed, we prove that c* is a decreasing function of a, where
a is the limiting value of m/log(n) when n tends to infinity.

1. Introduction. A significant tool for SAT research has been the 
study of random instances. It has stimulated fruitful interactions among 
the areas of artificial intelligence, theoretical computer science, mathemat- 
ics and statistical physics. Recently there has been a growth of interest in 
a powerful generalization of the Boolean satisfiability, namely the satisfia- 
bility of Quantified Boolean formulas, QBFs. Compared to the well-known 
propositional formulas, QBFs permit both universal and existential quan- 
tifiers over Boolean variables. Thus QBFs allow the modelling of problems 
having higher complexity than SAT, ranging in the polynomial hierarchy up 
to PSPACE. These problems include problems from the areas of verification, 
knowledge representation and logic (see, e.g., [10]).

Models for generating random instances of QBF have been proposed
[12], [3]. Problems for which one can combine practical experiments with
theoretical studies are natural candidates for first investigations [5]. In this 
paper, we focus on a certain subclass of closed quantified Boolean formulas,
which can be seen as quantified extended 2-CNF-formulas. These formulas
bear similarities with 2-CNF-formulas, whose random instances have been 
extensively studied in the literature (see, e.g., [4j [I3j dU El [8] ) . At the same 
time, the introduction of quantifiers increases the complexity and requires 
additional parameters for the generation of random instances. More precisely,
we are interested in closed formulas in conjunctive normal form (CNF)
having two quantifier blocks, namely in formulas of the type $\forall X \exists Y \varphi(X, Y)$,
where X and Y denote distinct sets of variables, and $\varphi(X, Y)$ is a conjunction
of 3-clauses, each of which contains exactly one universal literal and
two existential ones. Evaluating the truth value of such formulas is known
to be coNP-complete [11]. In order to generate random instances we have to
introduce several parameters. The first one is the pair (m, n) that specifies
the number of variables in each quantifier block, i.e., in X and Y. The second
one is $L = \lfloor cn \rfloor$, the number of clauses. We shall study the probability that
a formula drawn uniformly at random from this set of formulas evaluates
to true as n tends to infinity. We will denote this probability by $\mathbb{P}_{m,c}(n)$.
Thus, we are interested in

$$\lim_{n \to +\infty} \mathbb{P}_{m,c}(n).$$

Let us recall that the transition from satisfiability to unsatisfiability for 
random 2-CNF formulas is sharp. Indeed, there is a critical value (or a 
threshold) of the ratio of the number of clauses to the number of variables, 
above which the likelihood of a random 2-CNF-formula being satisfiable van- 
ishes as n tends to infinity, and below which it goes to 1. Moreover, this 
critical value is known to be 1 (see [3], [13]).

On the one hand observe that, when m = 1, a (1,2)-QCNF-formula with L
clauses can be seen as the conjunction of two independent 2-CNF-formulas
(each of which corresponds to an assignment to the universal variable and
has on average L/2 clauses). On the other hand, when m is large enough, a
random (1,2)-QCNF-formula with $L = \lfloor cn \rfloor$ clauses has essentially strictly
distinct universal literals, and then behaves as an existential 2-CNF-formula.
Thus, we can easily prove that the transition between satisfiability and
unsatisfiability for random (1,2)-QCNF-formulas occurs when c is between 1
and 2. Our main contribution is to identify the scale for m (as a function
of n) at which an intermediate and original regime can be observed,
$m = \lfloor a \log n \rfloor$. Moreover, at this specific scale, by developing further the
techniques used by Chvátal and Reed [3] and Goerdt [13], we get the precise
location of the threshold as a function of a. Our main result is:

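Theorem 1.1 For any a > 0, there exists c*(a) > 0 such that:

  if c < c*(a), then $\mathbb{P}_{\lfloor a \ln n \rfloor, c}(n) \to 1$ as $n \to +\infty$,
  if c > c*(a), then $\mathbb{P}_{\lfloor a \ln n \rfloor, c}(n) \to 0$ as $n \to +\infty$.

Moreover, the critical ratio c*(a) is given by

$$c^*(a) = \begin{cases} 2 & \text{if } a \ln 2 \le 1, \\ \text{the unique root of } \ln c + \left(\frac{2}{c} - 1\right) \ln(2 - c) = \frac{1}{a} & \text{if } a \ln 2 > 1. \end{cases}$$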



Figure 1 shows the evolution of the critical ratio c*(a) as a function of a.

Fig 1. Evolution of the critical ratio values.
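
The critical ratio of Theorem 1.1 can be evaluated numerically. The following is a minimal sketch, not part of the original paper: the function names `H` and `critical_ratio` are hypothetical, and the bisection relies on the fact that $H(c) = \ln c + (2/c - 1)\ln(2 - c)$ is increasing on ]1, 2[ (its derivative is $-(2/c^2)\ln(2 - c) > 0$), with H(1) = 0 and limit ln 2 at c = 2.

```python
import math


def H(c: float) -> float:
    """H(c) = ln c + (2/c - 1) ln(2 - c), extended by continuity at c = 2."""
    if c >= 2.0:
        return math.log(2.0)
    return math.log(c) + (2.0 / c - 1.0) * math.log(2.0 - c)


def critical_ratio(a: float, tol: float = 1e-12) -> float:
    """c*(a): 2 if a ln 2 <= 1, otherwise the root of H(c) = 1/a in ]1, 2[."""
    if a <= 0:
        raise ValueError("a must be positive")
    if a * math.log(2.0) <= 1.0:
        return 2.0
    lo, hi = 1.0, 2.0              # H(1) = 0 < 1/a < ln 2 = H(2)
    while hi - lo > tol:           # plain bisection on an increasing function
        mid = 0.5 * (lo + hi)
        if H(mid) < 1.0 / a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for a in (1.0, 2.0, 5.0, 20.0):
        print(a, critical_ratio(a))
```

Bisection is used here only because it needs nothing beyond monotonicity; any root finder would do.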



The paper is organized as follows. In Section 2 we examine the complexity
of deciding the truth value of a (1,2)-QCNF-formula. In order to
make the paper self-contained, we give there an alternative proof of the
coNP-completeness of this problem. In Section 3 we characterize the truth
of (1,2)-QCNF-formulas. We introduce specific substructures, comparable to
the ones introduced by Chvátal and Reed in [3]: we define pure bicycles,
which are necessary to ensure the falsity of a (1,2)-QCNF-formula, and pure
snakes, whose appearance is sufficient to ensure the falsity. In Section 3.2 we
give some enumerative results concerning pure bicycles and snakes, which
will be useful for determining the location of the threshold. In Section 4 we
present the probabilistic model and we give first estimates for the location of
the threshold. In Section 5 we prove our main result, Theorem 1.1. Finally,
Section 6 contains the proof of a technical proposition.



2. The complexity of (1,2)-QSAT. A literal is a propositional variable
or its negation. The atom of a literal $l$ is the variable $p$ if $l$ is $p$ or $\bar{p}$.
Literals are said to be strictly distinct when their corresponding atoms are
pairwise different. A clause is a finite disjunction of literals. A formula is in
conjunctive normal form (CNF) if it is a conjunction of clauses. A formula is
in k-CNF if every clause consists of exactly k literals. Here we are interested
in quantified propositional formulas of the form

$$F = \forall X \exists Y \varphi(X, Y)$$

where $X = \{x_1, \ldots, x_m\}$, $Y = \{y_1, \ldots, y_n\}$, and $\varphi(X, Y)$ is a 3-CNF
formula, with exactly one universal and two existential literals in each clause.
We will call such formulas (1,2)-QCNFs. These formulas can be considered
as quantified extended 2-CNF formulas, because deleting the only universal
literal in each clause and removing the then superfluous $\forall$-quantifiers results
in an existentially quantified conjunction of binary clauses.

A truth assignment for the universal (resp. existential) variables X (resp.
Y) is a Boolean function $I : X \to \{0, 1\}$ (resp. $Y \to \{0, 1\}$), which can be
extended to literals by $I(\bar{x}) = 1 - I(x)$.

A (1,2)-QCNF formula is true (or satisfiable) if for every assignment to the
variables X, there exists an assignment to the variables Y such that $\varphi$ is true
under this assignment. The exhaustive algorithm, which decides
whether for every assignment to the variables X there exists an assignment
to the variables Y such that $\varphi$ is true, provides a first upper bound for the
worst case complexity. Indeed, since the satisfiability of a 2-CNF formula can
be decided in linear time [1], the evaluation of the formula $\forall X \exists Y \varphi(X, Y)$
can be performed in time $O(2^m \cdot |\varphi|)$, where m is the number of universal
variables and $|\varphi|$ denotes the size of $\varphi$. Observe that, if m is of the order of
$\log n$, then this provides a polynomial time algorithm.
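
The exhaustive algorithm can be sketched as follows. This is an illustrative sketch under stated assumptions, not code from the paper: a clause is encoded as a hypothetical triple `(u, w1, w2)` of nonzero integers, where `u = ±i` stands for $x_i$ or $\bar{x}_i$ and `w = ±j` for $y_j$ or $\bar{y}_j$, and the linear-time 2-SAT test is done with strongly connected components of the implication graph (Kosaraju's algorithm is one standard choice).

```python
from itertools import product


def two_sat(n, clauses):
    """True iff the 2-CNF over y_1..y_n is satisfiable (implication graph + SCC)."""
    def idx(lit):                                  # literal -> node in [0, 2n)
        return 2 * (abs(lit) - 1) + (lit < 0)
    size = 2 * n
    g = [[] for _ in range(size)]
    gr = [[] for _ in range(size)]
    for a, b in clauses:                           # (a or b): not a -> b, not b -> a
        for u, v in ((idx(-a), idx(b)), (idx(-b), idx(a))):
            g[u].append(v)
            gr[v].append(u)
    seen, order = [False] * size, []
    for s in range(size):                          # pass 1: finishing order
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, 0)]
        while stack:
            v, i = stack.pop()
            if i < len(g[v]):
                stack.append((v, i + 1))
                w = g[v][i]
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, 0))
            else:
                order.append(v)
    comp, label = [-1] * size, 0
    for s in reversed(order):                      # pass 2: components on reversed graph
        if comp[s] != -1:
            continue
        comp[s] = label
        stack = [s]
        while stack:
            v = stack.pop()
            for w in gr[v]:
                if comp[w] == -1:
                    comp[w] = label
                    stack.append(w)
        label += 1
    # unsatisfiable exactly when some y_j and its negation share a component
    return all(comp[2 * j] != comp[2 * j + 1] for j in range(n))


def evaluate(m, n, clauses):
    """True iff for every assignment of x_1..x_m the residual 2-CNF is satisfiable."""
    for bits in product((False, True), repeat=m):
        # keep only the clauses whose universal literal is false under `bits`
        residual = [(w1, w2) for u, w1, w2 in clauses
                    if bits[abs(u) - 1] != (u > 0)]
        if not two_sat(n, residual):
            return False
    return True


if __name__ == "__main__":
    # (x1 or y1) and (x1 or not y1): false, take x1 = 0
    print(evaluate(1, 1, [(1, 1, 1), (1, -1, -1)]))    # False
    # (x1 or y1) and (not x1 or not y1): true
    print(evaluate(1, 1, [(1, 1, 1), (-1, -1, -1)]))   # True
```

The outer loop runs $2^m$ times and each 2-SAT call is linear in the number of clauses, which matches the bound stated above.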

In its full generality the problem (1,2)-QSAT is much harder, as stated in
the following theorem. This theorem was proved originally in [11]. In order
to make the paper self-contained, we give here an alternative proof.

Theorem 2.1 [11] The evaluation problem (1,2)-QSAT is coNP-complete.

Proof: To show membership in coNP, guess a vector of truth values $v_1, \ldots, v_m$
corresponding to $x_1, \ldots, x_m$. Replace in $\exists Y \varphi(X, Y)$ all free occurrences of
any $x_i$ by $v_i$, remove 0 from the clauses and delete clauses with 1. The
resulting formula is a 2-QCNF formula, whose unsatisfiability (i.e. falsity) can
be decided in linear time (see [1] for the details).

It remains to be shown that the problem is coNP-hard. We show this by 
a polynomial-time computable reduction from the satisfiability problem for 
3-CNF formulas. 

Consider such a formula

$$a : a_1 \wedge \ldots \wedge a_n \quad (n \ge 2)$$

over the variables $\{x_2, \ldots, x_m\}$, where each $a_i$ is a disjunction of exactly
three literals $l_{i,1}$, $l_{i,2}$ and $l_{i,3}$. We construct $\Psi(a)$, a (1,2)-QCNF formula.
Then we show that

(1) $a$ is satisfiable if and only if $\Psi(a)$ is false.

The reduction is as follows. We first choose n variables $y_1, \ldots, y_n$, all of
which are different from the variables $x_2, \ldots, x_m$ occurring in $a$. We take any
minimally unsatisfiable 2-CNF formula with n + 1 clauses, e.g., $\psi = \bigwedge_{i=0}^{n} \psi_i$
where

$$\psi_i = \begin{cases} y_1 \vee y_1 & \text{if } i = 0; \\ \bar{y}_i \vee y_{i+1} & \text{if } i \in \{1, \ldots, n-1\}; \\ \bar{y}_{n-1} \vee \bar{y}_n & \text{if } i = n. \end{cases}$$

For each clause $a_i = (l_{i,1} \vee l_{i,2} \vee l_{i,3})$ occurring in $a$, we define

$$\psi_{i,1} = \bar{l}_{i,1} \vee \psi_i, \qquad \psi_{i,2} = \bar{l}_{i,2} \vee \psi_i, \qquad \psi_{i,3} = \bar{l}_{i,3} \vee \psi_i.$$

Let $x_1$ be a new variable, i.e., $x_1$ is different from the ones in $\{y_1, \ldots, y_n\}$
and $\{x_2, \ldots, x_m\}$. Then

$$\Psi(a) : \forall x_1 \forall x_2 \cdots \forall x_m \exists y_1 \cdots \exists y_n \Big( (x_1 \vee \psi_0) \wedge \bigwedge_{i=1}^{n} (\psi_{i,1} \wedge \psi_{i,2} \wedge \psi_{i,3}) \Big).$$

Obviously, the reduction is polynomial-time computable.

We next prove (1). Observe that the formula resulting from $\Psi(a)$ by any
instantiation of the universal variables is a conjunction of clauses (maybe with repetitions)
from $\psi$. Therefore, since $\psi$ is minimally unsatisfiable, this formula will be
unsatisfiable if and only if every clause from $\psi$ occurs.
$\Longrightarrow$: Suppose $a$ is satisfiable. Take an arbitrary truth assignment $I : X \to \{0, 1\}$
which satisfies $a$. Then, for all $i = 1, \ldots, n$, there is (at least) one
$j \in \{1, 2, 3\}$ such that $I(l_{i,j}) = 1$. In the formula
$\exists y_1 \cdots \exists y_n ((x_1 \vee \psi_0) \wedge \bigwedge_{i=1}^{n} (\psi_{i,1} \wedge \psi_{i,2} \wedge \psi_{i,3}))$,
replace all free occurrences of $x_i$ by $I(x_i)$ for $i = 2, \ldots, m$ and $x_1$ by 0.
Observe that, whenever $l_{i,j}$ (for some $j \in \{1, 2, 3\}$) is true, its complement
in $\psi_{i,j}$ is false and we get $\psi_i$ after simplification. Therefore, in the existential
2-CNF formula obtained after simplification there remain the clause $\psi_0$ and
at least one copy of each clause $\psi_i$ for every $i = 1, \ldots, n$ (the one resulting
from $\psi_{i,j}$, for which $I(l_{i,j}) = 1$). Therefore, this formula is unsatisfiable, thus
proving that $\Psi(a)$ is false.

$\Longleftarrow$: Suppose $\Psi(a)$ is false. Then, there is a vector of truth values $v_1, \ldots, v_m$
corresponding to $x_1, \ldots, x_m$, such that the 2-QCNF formula obtained by
replacing all occurrences of any $x_i$ by $v_i$ is unsatisfiable. Since $\psi = \bigwedge_{i=0}^{n} \psi_i$
is minimally unsatisfiable, and according to the remark above, this means
that this resulting formula contains at least one copy of each $\psi_i$. For $i \ge 1$, this copy
can only come from a clause $\psi_{i,j}$ for some $j \in \{1, 2, 3\}$. Hence, we can deduce
that the assignment $I(x_k) = v_k$ for $k = 1, \ldots, m$ sets the literal $l_{i,j}$ to true,
and thus satisfies the clause $a_i$. Hence, this assignment satisfies the formula
$a$. □
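
The reduction used in this proof can be checked by brute force on small instances. The sketch below is illustrative only and rests on assumptions that are not confirmed by the source: the chain $\psi_0 = (y_1 \vee y_1)$, $\psi_i = (\bar{y}_i \vee y_{i+1})$, $\psi_n = (\bar{y}_{n-1} \vee \bar{y}_n)$ is one possible minimally unsatisfiable 2-CNF, and the names `reduce_3cnf`, `qcnf_true` and `cnf_sat` are hypothetical. Input variables 1..v are shifted by one so that index 1 is free for the fresh universal variable $x_1$.

```python
from itertools import product


def reduce_3cnf(cnf, num_vars):
    """Map a 3-CNF (list of integer triples) to (m, n, clauses) of a (1,2)-QCNF."""
    n = len(cnf)
    if n < 2:
        raise ValueError("the construction needs at least two clauses")

    def psi(i):                          # assumed minimally unsatisfiable chain
        if i == 0:
            return (1, 1)                # y1 or y1
        if i < n:
            return (-i, i + 1)           # not y_i or y_{i+1}
        return (-(n - 1), -n)            # not y_{n-1} or not y_n

    def shifted_negation(lit):           # input variable k becomes x_{k+1}, then negate
        return -(lit + 1) if lit > 0 else -(lit - 1)

    clauses = [(1, *psi(0))]             # (x1 or psi_0)
    for i, clause in enumerate(cnf, start=1):
        for lit in clause:               # psi_{i,j} = (complement of l_{i,j}) or psi_i
            clauses.append((shifted_negation(lit), *psi(i)))
    return num_vars + 1, n, clauses


def holds(lit, assignment):
    """Truth value of an integer literal under a tuple of booleans."""
    return assignment[abs(lit) - 1] == (lit > 0)


def qcnf_true(m, n, clauses):
    """Brute-force truth value of  forall X exists Y  (conjunction of clauses)."""
    return all(
        any(all(holds(u, xs) or holds(w1, ys) or holds(w2, ys)
                for u, w1, w2 in clauses)
            for ys in product((False, True), repeat=n))
        for xs in product((False, True), repeat=m))


def cnf_sat(cnf, num_vars):
    """Brute-force satisfiability of a CNF over variables 1..num_vars."""
    return any(all(any(holds(l, xs) for l in clause) for clause in cnf)
               for xs in product((False, True), repeat=num_vars))
```

On any small 3-CNF `cnf` with at least two clauses, `cnf_sat(cnf, v)` and `not qcnf_true(*reduce_3cnf(cnf, v))` should agree, which is exactly statement (1).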

3. Truth value of (1,2)-QCNF-formulas.

3.1. Pure subformulas. Let us first introduce a notion of purity over sets
of universal literals that will be of use to characterize the truth value of
(1,2)-QCNF-formulas.

Definition 3.1 A (multi-)set of literals is pure if it does not contain both a
variable $x$ and its negation $\bar{x}$. By extension, we call a (1,2)-QCNF-formula,
$F = \forall X \exists Y \varphi(X, Y)$, pure if the set of universal literals occurring in $\varphi$ is
pure.

Proposition 3.2 A (1,2)-QCNF-formula is false if and only if it contains
a false pure subformula.

Proof: One direction is obvious. Suppose that the (1,2)-QCNF-formula $F =
\forall X \exists Y \varphi(X, Y)$ is false. Then, there is an assignment $I$ to the universal
variables X such that for every assignment to Y, $\varphi$ evaluates to false. Consider the
subformula of F obtained by keeping only the clauses for which the universal
literal is assigned 0 by $I$, and deleting the other ones. This subformula is
pure (it cannot contain both a clause with a universal variable $x$ and another
with $\bar{x}$, since either $x$ or $\bar{x}$ is assigned 1 by $I$), and is false by the choice of
$I$. □

Now observe that the truth value of a pure (1,2)-QCNF-formula F is the
same as the truth value of the existential 2-CNF formula $F_Y$ obtained by
removing the universal literal in each clause and then deleting the universal
quantifiers. Therefore, we can appeal to the work of Chvátal and Reed [3]
in order to identify substructures that are sufficient (respectively, necessary)
to ensure falsity. On the one hand Chvátal and Reed exhibited elementary
unsatisfiable 2-CNF-formulas, called snakes. On the other hand they identified
extremal substructures, called bicycles, that appear in any unsatisfiable
2-CNF-formula. Thus, we can define pure snakes and pure bicycles.

Definition 3.3 A pure snake of length $s + 1 \ge 4$, with $s + 1 = 2t$, is a
set of $s + 1$ clauses $C_0, \ldots, C_s$ which have the following structure: there is
a sequence of s strictly distinct existential literals $w_1, \ldots, w_s$, and a pure
sequence of $s + 1$ universal literals $v_0, \ldots, v_s$ such that, for every $0 \le r \le s$,
$C_r = (v_r \vee \bar{w}_r \vee w_{r+1})$ with $w_0 = w_{s+1} = \bar{w}_t$.

Definition 3.4 A pure bicycle of length $s + 1 \ge 3$ is a set of $s + 1$ clauses
$C_0, \ldots, C_s$ which have the following structure: there is a sequence of s strictly
distinct existential literals $w_1, \ldots, w_s$, and a pure sequence of $s + 1$ universal
literals $v_0, \ldots, v_s$ such that, for $0 < r < s$, $C_r = (v_r \vee \bar{w}_r \vee w_{r+1})$, $C_0 =
(v_0 \vee u \vee w_1)$ and $C_s = (v_s \vee \bar{w}_s \vee v)$ with literals $u$ and $v$ chosen from
$w_1, \ldots, w_s, \bar{w}_1, \ldots, \bar{w}_s$ with $(u, v) \ne (\bar{w}_s, w_1)$.

Thus, we get the following proposition. 

Proposition 3.5 

• Every (1,2)-QCNF-formula that contains a pure snake is false.

• Every (1,2)-QCNF-formula that is false contains a pure bicycle.

3.2. Enumerative results. 

Proposition 3.6 Let m be the number of universal variables and let n be
the number of existential variables we can choose from.

• The number of snakes of length $s + 1$ is

(2) $(n)_s \, 2^s \, d(m, s + 1)$

where

(3) $d(m, s + 1) = \sum_{k=1}^{\min(m, s+1)} \binom{m}{k} \cdot 2^k \cdot S(s + 1, k) \cdot k!$

with $S(m, k)$ denoting the Stirling number of the second kind, and
$(n)_s = n(n - 1) \cdots (n - s + 1)$.

• Let $A_0$ be a given pure snake of length $s + 1 = 2t$. For every $1 \le i \le 2t - 1$,
let $N_{m,s}(i)$ denote the number of pure snakes B of length $s + 1$ such
that $A_0$ and B share exactly i clauses. Then for $1 \le i \le t - 1$
and for $t \le i \le 2t - 1$



(5) N m ,,(i) < 2(s + l) s 



21 (s + l)\h 



A=0 n " S 



(n),-^-* d(m,s + l-i) 



• The number of bicycles of length $s + 1$ is

(6) $[(2s)^2 - 1] \, (n)_s \, 2^s \, d(m, s + 1)$

Proof: Given a literal $w$, let $|w|$ denote its underlying variable. Observe that
a snake of length $s + 1 = 2t$ contains s distinct variables. Moreover, every
variable $|w_i|$ appearing in a snake occurs exactly twice (once positively and
once negatively), except for $|w_0|$ which occurs four times (twice positively
and twice negatively). This special variable will be called the double point of
the snake. A snake can be described by a (circular) sequence of existential
literals $w_0, w_1, \ldots, w_s, (w_0)$ (with $w_0 = \bar{w}_t$), together with the corresponding
pure sequence of universal literals $v_0, v_1, \ldots, v_s$.

Choosing a snake of length $s + 1$ comes down to choosing a sequence of
s strictly distinct literals $w_1, \ldots, w_s$, and then choosing the pure sequence of
$s + 1$ universal literals $v_0, \ldots, v_s$ (they are not necessarily distinct but no
literal can be the complement of another). Let $d(m, s + 1)$ be the number of
pure sequences of literals of length $s + 1$, having a set of m variables from
which the literals can be built. Let us recall that $S(m, k) \cdot k!$ is the number
of mappings from a set of m elements onto a set of k elements. A pure
sequence of literals of length $s + 1$ is obtained by exactly one sequence of
choices in the following process.

1. Choose the number k of different variables occurring in the sequence. 

2. Choose the k variables. 

3. For each such variable, choose whether it occurs positively or nega- 
tively. 

4. Choose their places in the sequence. 
This gives the announced number of snakes. 
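
The counting argument for $d(m, s + 1)$ can be cross-checked on small values. The sketch below is an illustration with hypothetical function names: `d` evaluates formula (3), `d_bruteforce` enumerates all sequences of literals and keeps the pure ones, and `num_snakes` evaluates (2). It assumes Python 3.8 or later for `math.comb` and `math.perm`.

```python
from itertools import product
from math import comb, factorial, perm


def stirling2(n, k):
    """Stirling number of the second kind S(n, k), by the usual recurrence."""
    table = [[0] * (k + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, min(i, k) + 1):
            table[i][j] = j * table[i - 1][j] + table[i - 1][j - 1]
    return table[n][k]


def d(m, length):
    """Number of pure sequences of `length` literals over m variables, formula (3)."""
    return sum(comb(m, k) * 2 ** k * stirling2(length, k) * factorial(k)
               for k in range(1, min(m, length) + 1))


def d_bruteforce(m, length):
    """Same count by direct enumeration; only usable for very small m and length."""
    literals = [v for x in range(1, m + 1) for v in (x, -x)]
    return sum(1 for seq in product(literals, repeat=length)
               if not any(-lit in seq for lit in seq))


def num_snakes(n, m, s):
    """Formula (2): (n)_s * 2^s * d(m, s + 1), with (n)_s the falling factorial."""
    return perm(n, s) * 2 ** s * d(m, s + 1)


if __name__ == "__main__":
    print(d(2, 2), d_bruteforce(2, 2))   # both 12: 16 sequences minus 4 impure ones
```

The four factors in (3) correspond one to one to the four choices listed above: k, the k variables, their signs, and a surjection of positions onto variables.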

Let $A_0$ be a given pure snake of length $s + 1 = 2t$, and let $N_{m,s}(i)$ be the number of
pure snakes B of length $s + 1$ such that $A_0$ and B share exactly i clauses. If
$i \le 2t - 1$, this number can be decomposed as

$$N_{m,s}(i) = \sum_{j \ge i+1} N_{m,s}(i, j)$$

where $N_{m,s}(i, j)$ is the number of pure snakes B such that $A_0$ and B share
exactly i clauses and j variables. In the rest of the proof, for readability
we omit the subscripts m, s in $N_{m,s}(i, j)$, thus writing $N(i, j)$. Now we are
looking for upper bounds on the $N(i, j)$.

Let us note that the intersection of $A_0$ and B can be read on the (circular)
sequence of literals $w_0, w_1, \ldots, w_t, \ldots, w_s, (w_0)$, where $\bar{w}_t = w_0$. In order to get
i clauses and j variables in common, one has to choose $k = (j - i)$ blocks of
consecutive literals in this sequence. We make a case distinction according
to whether the two snakes $A_0$ and B have the same double point or not.

• $N_a(i, j)$ denotes the number of pure snakes B of length $s + 1$ such that
$A_0$ and B share exactly i clauses and j variables, and have the same
double point $|w_0|$,

• $N_b(i, j)$ denotes the number of pure snakes B of length $s + 1$ such that
$A_0$ and B share exactly i clauses and j variables, and do not have the
same double point.

Thus $N(i, j) = N_a(i, j) + N_b(i, j)$.

Let us first consider $N_a(i, j)$. Observe that in the special case when $j =
i + 1$ (only one block), and $A_0$ and B have the same double point, then i is
necessarily equal to or larger than t. Therefore,

(7) for $1 \le i \le t - 1$, $N_a(i, i + 1) = 0$.

In the general case, to count $N_a(i, j)$, we perform the following sequence of
choices:

(i) the intersection $A_0 \cap B$ such that it has i clauses and j variables,

(ii) the sequence of strictly distinct existential literals that are in $B \setminus (A_0 \cap B)$,

(iii) the places of the k blocks of $A_0 \cap B$ among the literals chosen in (ii),

(iv) the universal literals occurring in the clauses of $B \setminus (A_0 \cap B)$.

Step (i). To build the intersection $A_0 \cap B$, we choose 2k literals in the
sequence representing $A_0$. They represent the first and last literals of the k
blocks of $A_0 \cap B$. The first literal is chosen after or at $w_0$. To define completely
the intersection, we need to know whether this first literal is the beginning
or the end of a block, so we get at most $2\binom{s+1}{2k} \le (s + 1)^{2k}$ possible choices.

Step (ii). Notice that $|w_0|$ is the double point of B. So, it remains only to
choose a sequence of $s - (j - 1)$ strictly distinct literals. Thus, we have at
most $(n)_{s+1-j} \, 2^{s+1-j}$ possible choices.

Step (iii). We need to choose how the k blocks will be plugged among the
"remaining literals" chosen in (ii). This leads to at most $(s + 1)^k$ possible
choices.

Step (iv). There are $s + 1 - i$ universal literals to choose, and they must
be chosen in a pure way. So, there are at most $d(m, s + 1 - i)$ choices.
Thus, since $k = j - i$ we obtain that for $1 \le i \le 2t - 1$, $j \ge i + 1$

(8) $N_a(i, j) \le (n - s) \left( \frac{(s + 1)^3}{n - s} \right)^{j - i} (n)_{s-i} \, 2^{s-i} \, d(m, s + 1 - i).$

The enumeration of $N_b(i, j)$ differs from the one of $N_a(i, j)$ only at step
(ii). Indeed, when B does not have $|w_0|$ as a double point, at step (ii) we
have first to choose a sequence of $s - j$ strictly distinct literals (thus having
determined the s variables occurring in B), and then choose one of these s
variables as the double point. Hence, we have at most $s \, (n)_{s-j} \, 2^{s-j}$ choices.
Thus, we get for $1 \le i \le 2t - 1$ and $j \ge i + 1$

(9) $N_b(i, j) \le s \left( \frac{(s + 1)^3}{n - s} \right)^{j - i} (n)_{s-i} \, 2^{s-i} \, d(m, s + 1 - i).$

Then, equation (4) follows from (7), (8) and (9), while (5) follows from (8)
and (9).

The enumeration of bicycles is similar to the one of snakes. We just have to
choose in addition $u$ and $v$ among $w_1, \ldots, w_s, \bar{w}_1, \ldots, \bar{w}_s$ such that $(u, v) \ne
(\bar{w}_s, w_1)$. This explains the extra factor, $[(2s)^2 - 1]$, in (6). □

4. Location of the transition for (1,2)-QSAT. We consider formulas
built on m universal variables and n existential variables. Thus we have
$N = m \, 2^3 \binom{n}{2} = 4mn(n - 1)$ different clauses at hand. We may establish our
result by considering random formulas obtained by taking each one of the
N possible clauses independently from the others with probability $p \in \, ]0, 1[$.
Let c > 0; it is well known, see for instance [14, Sections 1.4 and 1.5], that
the threshold obtained in this model translates to the model alluded to in
the introduction, in which $L = \lfloor cn \rfloor$ distinct clauses are picked uniformly
at random among all the N possible choices, when $p = \frac{c}{4m(n-1)}$. Thus,
from now on we shall always suppose that $p = \frac{c}{4mn}$, and we continue to
denote by $\mathbb{P}_{m,c}(n)$ the probability that a random formula in this model is
satisfiable. We are interested in studying $\lim_{n \to +\infty} \mathbb{P}_{m,c}(n)$ as a function of the
parameters m and c. Any value of c such that $\mathbb{P}_{m,c}(n) \to 1$ (resp. such that
$\mathbb{P}_{m,c}(n) \to 0$) gives a lower (resp. upper) bound for the threshold effect
associated to the phase transition.



THE THRESHOLD FOR RANDOM (1,2)-QSAT 






Let us recall that the 2-SAT property exhibits a sharp transition, with
a critical value equal to 1 (see [3] and [13]). From this result it is easy to
deduce that the phase transition from satisfiability to unsatisfiability for
(1,2)-QCNF formulas occurs when $1 \le c \le 2$.

Proposition 4.1 Let m = m(n) be any sequence of integers.

• If c < 1 then $\mathbb{P}_{m,c}(n) \to 1$ as $n \to \infty$.

• If c > 2 then $\mathbb{P}_{m,c}(n) \to 0$ as $n \to \infty$.

Proof: Let F be a random (1,2)-QCNF-formula. Let us consider $F_t$, the
2-CNF formula obtained from F by setting all the variables $x_1, \ldots, x_m$ to
true and omitting all quantifiers. If F is satisfiable, then so is $F_t$. Notice
that $F_t$ can be obtained by picking independently each possible 2-clause
with probability

$$q(n) = 1 - (1 - p(n))^m = \frac{c}{4n} + o\left(\frac{1}{n}\right).$$

Thus the average number of clauses in $F_t$ is equal to

$$4 \binom{n}{2} \cdot q \sim \frac{c}{2} \cdot n.$$

It follows from the threshold of 2-SAT [3], [13] that $F_t$ is unsatisfiable with
probability tending to 1 if c > 2. Thus, the same holds for F.

Now, we look at the existential part of the formula, $F_Y$. Observe that if
$F_Y$ is satisfiable, then so is F. In $F_Y$, each of the $4\binom{n}{2}$ 2-clauses appears
independently with probability

$$q'(n) = 1 - (1 - p(n))^{2m} = \frac{c}{2n} + O\left(\frac{1}{n^2}\right).$$

Therefore, the threshold of 2-SAT tells us that when c < 1, the formula $F_Y$
is satisfiable with probability tending to one. The same holds for F. □



5. Proof of the main result.

5.1. General inequalities. Let $B_s$ and $X_s$ be respectively the number
of pure bicycles and pure snakes of length $s + 1$ in a random (1,2)-QCNF
formula. Let us recall that in such a formula, each clause is chosen with
probability $p = \frac{c}{4mn}$. Hence, if $\mathbb{E}_{m,c}(B_s)$ and $\mathbb{E}_{m,c}(X_s)$ denote the average
number of bicycles and snakes of length $s + 1$ in a random (1,2)-QCNF
formula, we get from (2), (3) and (6) the following two equations:

(10) $\mathbb{E}_{m,c}(X_s) = p^{s+1} (n)_s \, 2^s \, d(m, s + 1)$

(11) $\mathbb{E}_{m,c}(B_s) = \mathbb{E}_{m,c}(X_s) \, ((2s)^2 - 1).$

In order to prove that c* is the critical value for the (decreasing) satisfiability
property for (1,2)-QCNF-formulas, we will use two sequences of inequalities.
The first one follows from Proposition 3.5 and Markov's inequality applied to
the number of bicycles. We have

(12) $1 - \mathbb{P}_{m,c}(n) \le \Pr\Big( \sum_{s \ge 2} B_s \ge 1 \Big) \le \sum_{s \ge 2} \mathbb{E}_{m,c}(B_s).$

The second one is obtained by considering the number of snakes. Proposition
3.5 and a general exponential inequality given in [14, Theorem 2.18 ii)] show
that for any $s \ge 3$

(13) P m c (n) < Pr(X s = 0) < exp ( E m c (X s ) \ 

Finally, recall that we can suppose that 1 < c < 2, according to
Proposition 4.1.

5.2. When the critical ratio is equal to 2. Let us start with a proposition
which enables us to control the mean number of bicycles for any c in ]1, 2[.

Proposition 5.1 For any 1 < c < 2, the following statements hold when n 
tends to infinity 

• ifm< ^ then ^E miC (5 s ) = o(l) 

• if m = [ahmj with a In 2 > 1 then ^ E mjC (i? s ) = o(l). 



Proof: Let us recall that the coefficient $d(m, s + 1)$ occurring in $\mathbb{E}_{m,c}(B_s)$
is the number of pure sequences of literals of length $s + 1$, when we have
m variables from which the literals can be built. Note that $d(m, s + 1)$ is
bounded from above by $2^{\min(m, s+1)}$ times the number of mappings from
$\{1, \ldots, s + 1\}$ to $\{1, \ldots, m\}$. Therefore,

(14) $d(m, s + 1) \le 2^{\min(m, s+1)} \, m^{s+1}.$
From (11), it follows that if $s < m$ then $\mathbb{E}_{m,c}(B_s) \le \frac{2 c^{s+1} s^2}{n}$. Thus

(15) $\sum_{s < m} \mathbb{E}_{m,c}(B_s) \le \left( \frac{2c}{c - 1} \right) \frac{m^2 c^m}{n}.$

If $s \ge m$, then (14) gives $\mathbb{E}_{m,c}(B_s) \le \frac{2^{m+1} (c/2)^{s+1} s^2}{n}$. When $0 < x < 1$ and
$r \ge 2$, standard computations show that

(16) $\sum_{s = r}^{+\infty} s^2 x^s \le \frac{r^2 x^r}{(1 - x)^3}.$

Hence we get, for $r \ge m$,

(17) $\sum_{s \ge r} \mathbb{E}_{m,c}(B_s) \le \frac{c \, 2^m r^2 (c/2)^r}{n \, (1 - c/2)^3}.$

The proof of Proposition 5.1 is now an easy consequence of (15) and (17). □

Theorem 1.1 when $a \ln 2 \le 1$ follows from Proposition 5.1, inequality (12)
and Proposition 4.1.

In the sequel, we consider the case where $m = \lfloor a \ln n \rfloor$, with $a > 1/\ln 2$.

5.3. The critical ratio as a function of a. The main difficulty when dealing
with $\mathbb{E}_{m,c}(B_s)$ and $\mathbb{E}_{m,c}(X_s)$ is to handle the coefficient $d(m, s + 1)$ given
in Proposition 3.6:

$$d(m, s + 1) = \sum_{k=1}^{\min(m, s+1)} \binom{m}{k} 2^k \cdot S(s + 1, k) \cdot k!.$$

First, let us denote for $1 \le k \le \min(m, s + 1)$

(18) $G_{m,c}(k, s + 1) = 2^s (n)_s \binom{m}{k} 2^k S(s + 1, k) \, k! \left( \frac{c}{4mn} \right)^{s+1}.$

From (10) and (11), the behavior of $\mathbb{E}_{m,c}(X_s)$ and $\mathbb{E}_{m,c}(B_s)$ is clearly governed
by the coefficients $G_{m,c}(k, s + 1)$. Indeed, since $p = \frac{c}{4mn}$ we get

(19) $\mathbb{E}_{m,c}(B_s) = \sum_{k=1}^{\min(m, s+1)} G_{m,c}(k, s + 1) \, ((2s)^2 - 1) = ((2s)^2 - 1) \, \mathbb{E}_{m,c}(X_s).$

Second, we will need better bounds than the one given in (14). We will
use well-known estimates for binomial coefficients. If $1 \le b \le a$, then the
following inequalities hold:



1 /a\° ( a \ a - b fa\ /a x b 



< 20 » ia{- b )-{— b ) £ W-V& 

Then, from [15], we have the following bounds for Stirling numbers of the
second kind. There exist K > 0 and K' > 0 such that, for $1 \le b \le a$, the
following inequalities hold:
(21) 




-) a x b - a <b\S{a,b) <K'Vb(?—-^) (-] .r 



b—a 




where xq > is a function of b/a defined implicitly for 6 < a by 1 — e x ° = 
^xq, and for a = b by xq = 0. The conventions are that 0° = 1 and = 1. 

By using these precise results, already used in [S] and [5], it appears that 
the behaviour of the coefficients $G_{m,c}(k, s + 1)$, and so that of the average
number of snakes or bicycles, is governed by a continuous function of several
real variables. From (18), (20) and (21) we obtain:



Proposition 5.2 There exist A > 0 and B > 0 such that for any c > 0, for
all positive integers n, m, s and k such that $k \le \min(m, s + 1)$:
(22) 

A ( n ] s ^ n 9 ^-^'^ < G mc (k, s + l)< B^R n 9 ^^'^ 
n s ^/m(s + 1) 

where g afi is the continuous function on V a = {(/3,7) | < (3 < a and (3 < 
7} defined for < (3 < 7 by 

(23) g a ,Ml) =ln 1 n ] 



e 



-xq and g a ,c((3,f3) = In 
7 



- 



e Vea/ (a - /?) Q -^ 



Recall that we have taken $m = \lfloor a \ln n \rfloor$. Observe that the second part of
Proposition 5.1 together with (11) indicates that long snakes, and similarly
long bicycles, of length $\gg \ln n$, have asymptotically no chance to appear
when $a > 1/\ln 2$ and $c \in \, ]1, 2[$. Therefore, in our study we will focus on
snakes of length proportional to $\ln n$. Hence, let us set $\beta = k / \ln n$, $\gamma =
(s + 1)/\ln n$. The following result will point out, for each a, the values of k
and s that contribute the most to the average number. Indeed we will prove
the following central result:

Proposition 5.3 Let 1 < c < 2, and for any a > 0 let $\mathcal{D}_a$ be the following
domain

$$\mathcal{D}_a = \{ (\beta, \gamma) \mid 0 < \beta \le a \text{ and } \beta \le \gamma \}.$$

The function $g_{a,c}$ defined by (23) has a global maximum on $\mathcal{D}_a$, given by its
unique stationary point in $\mathcal{D}_a$. More precisely

(24) $\max_{(\beta, \gamma) \in \mathcal{D}_a} g_{a,c}(\beta, \gamma) = g_{a,c}(\tilde\beta(a, c), \tilde\gamma(a, c)) = aH(c) - 1$

with $\tilde\beta = \frac{2a(c - 1)}{c}$, $\tilde\gamma = \frac{-2a \ln(2 - c)}{c}$, $H(c) = \ln c + \left( \frac{2}{c} - 1 \right) \ln(2 - c)$.

Moreover, for any domain $\mathcal{V}_a \subset \mathcal{D}_a$ such that $(\tilde\beta, \tilde\gamma) \notin \mathcal{V}_a$,

(25) $\max_{(\beta, \gamma) \in \mathcal{V}_a} g_{a,c}(\beta, \gamma) < aH(c) - 1.$

The proof of this result is rather technical, so we postpone it to the next 
section. 
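
Before the proof, the statement can be sanity-checked numerically. The sketch below is illustrative and not from the paper: it assumes that $g_{a,c}(\beta, \gamma) = -1 + a \ln a - (a - \beta)\ln(a - \beta) + \gamma \ln\frac{c\gamma}{2 e x_0 a} + \beta \ln\frac{2(e^{x_0} - 1)}{\beta}$ with $x_0$ the positive solution of $1 - e^{-x_0} = (\beta/\gamma) x_0$, and that $g_{a,c}(\beta, \beta) = -1 + a \ln a - (a - \beta)\ln(a - \beta) + \beta \ln\frac{c}{ea}$ on the boundary. The names `x0_of`, `g`, `H` and `stationary_point` are hypothetical.

```python
import math


def H(c):
    """H(c) = ln c + (2/c - 1) ln(2 - c) for 1 < c < 2."""
    return math.log(c) + (2.0 / c - 1.0) * math.log(2.0 - c)


def x0_of(beta, gamma):
    """Positive root of 1 - exp(-x) = (beta/gamma) x; returns 0 when beta >= gamma."""
    if beta >= gamma:
        return 0.0
    r = beta / gamma

    def f(x):                                    # 1 - e^{-x} - r x, concave, f(0) = 0
        return -math.expm1(-x) - r * x

    lo = -math.log(r)                            # maximiser of f, hence f(lo) > 0
    hi = lo + 1.0
    while f(hi) > 0.0:                           # grow until the sign changes
        hi *= 2.0
    for _ in range(200):                         # bisection
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def xlogx(t):
    """t ln t with the convention 0 ln 0 = 0."""
    return 0.0 if t == 0.0 else t * math.log(t)


def g(a, c, beta, gamma):
    """Assumed form of g_{a,c} on 0 < beta <= a, beta <= gamma."""
    head = -1.0 + a * math.log(a) - xlogx(a - beta)
    if gamma <= beta:                            # boundary case, x0 = 0
        return head + beta * math.log(c / (math.e * a))
    x0 = x0_of(beta, gamma)
    return (head
            + gamma * math.log(c * gamma / (2.0 * math.e * x0 * a))
            + beta * math.log(2.0 * math.expm1(x0) / beta))


def stationary_point(a, c):
    """(beta~, gamma~) as given in Proposition 5.3."""
    return 2.0 * a * (c - 1.0) / c, -2.0 * a * math.log(2.0 - c) / c


if __name__ == "__main__":
    a, c = 3.0, 1.5
    bt, gt = stationary_point(a, c)
    print(g(a, c, bt, gt), a * H(c) - 1.0)       # the two values should agree
```

A coarse grid over the domain can then be compared against `a * H(c) - 1`; this does not replace the proof, it only guards against transcription errors in the formulas.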

Now we can prove Theorem 1.1 when $a \ln 2 > 1$, in other words that,
when $a \ln 2 > 1$, the critical ratio c*(a) is the unique root of $aH(c) = 1$.
For this, we will use two corollaries of Proposition 5.2 and Proposition 5.3.

Corollary 5.4 Let $a > 1/\ln 2$ and c < 2 be such that $aH(c) < 1$. Then, as
n tends to infinity

$$\sum_{s \ge 2} \mathbb{E}_{\lfloor a \ln n \rfloor, c}(B_s) = o(1).$$

Proof: From Proposition 15.11 we have E^ a i nn j c (B s ) = o(l). 

s >2 1 ° 1 „ n2 r 1 Inn 

— In 2 — In c 

Then, from (fT9|h the upper bound (|22p and ([2JJ) we get 

■ Inn 



In 2 — In c 



n 



with6» n -> g a , c 0,l) = aH(c)-l < 0. Therefore, ^ E LalnnJ c (5 s 



o(l). I 



With (12), this corollary proves that, when $a \ln 2 > 1$ and for any c <
c*(a), we have $\mathbb{P}_{m,c}(n) = 1 - o(1)$.

By considering (13) with $s + 1 = \lfloor \tilde\gamma \ln n \rfloor = 2t$, it will follow easily from
the corollary given below that, when $a \ln 2 > 1$, for any c > c*(a) we have
$\mathbb{P}_{m,c}(n) = o(1)$. This will end the proof of Theorem 1.1. Note that the
coefficients $N_{m,s}(i)$ appearing in the following corollary are the ones defined
in Proposition 3.6.

Corollary 5.5 Let $a > 1/\ln 2$ and c < 2 be such that $aH(c) > 1$, and let
$s + 1 = \lfloor \tilde\gamma \ln n \rfloor$. Then there exist $0 < \delta < 2(aH(c) - 1)$, C > 0 and D > 0
such that



and 



Ei Ql „„i, c (l s ) > C n" 3 ®- 1 - 



1=1 



Proof: From (25) in Proposition 5.3 we first choose $\delta \in \, ]0, 2(aH(c) - 1)[$
such that

max 



{([3,~f)s.t.-y<UnT) a 



9a,c{P,l) < max.g a , c (P,i) - 5. 



Again using (19) and the lower bound in (22), we can find C > 0 such
that for $s + 1 = \lfloor \tilde\gamma \ln n \rfloor$

E [alnnUc (X s )>Cn^^--2. 

As $g_{a,c}(\tilde\beta, \tilde\gamma) = aH(c) - 1$, the first assertion is proved.

Then, with $p = \frac{c}{4mn}$, from (4) and (5) we get first for $1 \le i < t$

min(m,s+l) 

N myS (i)p s+1 -*<2(s + lf Yl 

fc=l 

and second for $t \le i \le 2t - 1$
,h=l



n 



G mtC (k,s + l-i) 



min(m,s+l) 

N mtS (i)p s+1 - l <2(s + l) 3 £ 

fc=l 



■ 2* 

£( 

7i=0 



G ni)C {k 1 s + 1 - i). 



At last, using (22) with $s + 1 = \lfloor \tilde\gamma \ln n \rfloor$ and with our choice for $\delta$ we
obtain

-Amn' 



t-i 



E Ar L«mnj,,«(A) " < A(lnn)¥ n<* H 



(c)-2 



i=l 
2t-l 



E^Laln«j lS «(£-) S+1 "<^2(ln 



n )f 77 , a ' ff ( c )- 1 - <5 . 



1=1 
6. Proof of Proposition 5.3. Let us recall that for any 1 < c < 2 and
a > 0, we consider the domain $\mathcal{D}_a = \{ (\beta, \gamma) \mid 0 < \beta \le a \text{ and } \beta \le \gamma \}$ for the
function $g_{a,c}$ given from (23) by

(26) $g_{a,c}(\beta, \gamma) = -1 + a \ln a - (a - \beta) \ln(a - \beta) + \gamma \ln\left[ \frac{c\gamma}{2 e x_0 a} \right] + \beta \ln\left[ \frac{2(e^{x_0} - 1)}{\beta} \right]$

(27) $g_{a,c}(\beta, \beta) = -1 + a \ln a - (a - \beta) \ln(a - \beta) + \beta \ln\left[ \frac{c}{ea} \right]$

with $x_0$ defined implicitly when $0 < \beta < \gamma$ by

(28) $1 - e^{-x_0} = \frac{\beta}{\gamma} x_0.$

In the sequel, we shall write $g$ for $g_{a,c}$ and $\mathcal{D}$ for $\mathcal{D}_a$.

Proposition 5.3 tells us that $g$ has a strict and global maximum on $\mathcal{D}$
which is equal to $aH(c) - 1$ with $H(c) = \ln c + \left(\frac{2}{c} - 1\right)\ln(2 - c)$. The proof
of Proposition 5.3 follows from the following claim:

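Claim 6.1 For any $1 < c < 2$ and $a > 0$,

1. for every fixed $\beta$ with $0 < \beta \le a$, the function $\gamma \mapsto g(\beta, \gamma)$ is strictly
concave on $]\beta, +\infty[$ with a strict maximum at $\gamma_\beta = \frac{2a}{c}\ln\left(\frac{2a}{2a - \beta c}\right)$.

2. the function $\beta \mapsto g(\beta, \gamma_\beta)$ is strictly concave on $]0, a]$ with a maximum
at $\tilde\beta = \frac{2a(c-1)}{c}$; then, with $\tilde\gamma := \gamma_{\tilde\beta} = \frac{-2a\ln(2-c)}{c}$, $g(\tilde\beta, \tilde\gamma) = aH(c) - 1$.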

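Proof: For the first point of this claim we compute, from (26) and (28), the
partial derivatives of $g$ with respect to $\gamma$. We get

(29) $\frac{\partial g}{\partial \gamma}(\beta,\gamma) = \ln\left(\frac{c\gamma}{2 x_0 a}\right)$ and $\frac{\partial^2 g}{\partial \gamma^2}(\beta,\gamma) = \frac{\gamma - \beta x_0}{\gamma\,(\gamma - \beta(x_0 + 1))}$.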

With (28) we first observe that

(30) $\gamma - \beta x_0 = \gamma e^{-x_0} > 0$.

Then

$\gamma - \beta(x_0 + 1) = \gamma - \beta x_0 - \beta$
$\qquad = \gamma e^{-x_0} - \beta$
$\qquad = \gamma e^{-x_0} - \frac{\gamma(1 - e^{-x_0})}{x_0}$
$\qquad = \frac{\gamma}{x_0}\left(x_0 e^{-x_0} - 1 + e^{-x_0}\right)$.

Let $\varphi(x) = x e^{-x} - 1 + e^{-x}$. The function $\varphi$ is decreasing with $\varphi(0) = 0$.
Hence, $\varphi(x_0) < 0$ and

(31) $\gamma - \beta(x_0 + 1) < 0$.



From the second identity in (29), (30) and (31) we conclude that $\frac{\partial^2 g}{\partial \gamma^2}(\beta, \gamma) < 0$.
The strict concavity of $\gamma \mapsto g(\beta, \gamma)$ follows. Then the first identity in (29)
and (28) give the expected formula for the unique extremum; indeed we
obtain

(32) $\gamma_\beta = \frac{2 x_0 a}{c} = \frac{2a}{c}\ln\left(\frac{2a}{2a - \beta c}\right)$ and $e^{x_0} - 1 = \frac{\beta c}{2a - \beta c}$.



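For the second point of the claim, observe that with (26) we have:

$g(\beta,\gamma) = -1 + \gamma \ln\left[\frac{c\gamma}{2 x_0 a}\right] - \gamma + a \ln a - (a - \beta)\ln(a - \beta) + \beta \ln\left[\frac{2(e^{x_0} - 1)}{\beta}\right]$,

thus from (32) we obtain

(33) $g(\beta, \gamma_\beta) = -1 + a\, K_c\left(\frac{\beta}{a}\right)$

where for any $x \in\, ]0, 1[$, $K_c(x) = x \ln c + \left(\frac{2}{c} - x\right)\ln\left(1 - \frac{cx}{2}\right) - (1 - x)\ln(1 - x)$.

$K_c$ is strictly concave on $]0,1[$ and reaches its maximum at $x = \frac{2(c-1)}{c}$.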
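The concavity of $K_c$ and the location of its maximum are stated without computation. The lines below are a verification sketch, not part of the original argument. They assume the reading $K_c(x) = x \ln c + (2/c - x)\ln(1 - cx/2) - (1 - x)\ln(1 - x)$ and $H(c) = \ln c + (2/c - 1)\ln(2 - c)$, with $0 < x < 1$ and $1 < c < 2$.

```latex
\begin{align*}
K_c'(x) &= \ln c - \ln\Bigl(1 - \frac{cx}{2}\Bigr) - 1 + \ln(1 - x) + 1
         = \ln\frac{c\,(1 - x)}{1 - \frac{cx}{2}}, \\
K_c''(x) &= \frac{c}{2 - cx} - \frac{1}{1 - x}
          = -\frac{2 - c}{(1 - x)(2 - cx)} < 0, \\
K_c'(x) = 0 &\iff c\,(1 - x) = 1 - \frac{cx}{2} \iff x = \frac{2(c - 1)}{c}, \\
K_c\Bigl(\frac{2(c-1)}{c}\Bigr)
  &= \frac{2(c-1)}{c}\ln c + \frac{2(2-c)}{c}\ln(2 - c) - \frac{2-c}{c}\ln\frac{2-c}{c} \\
  &= \ln c + \Bigl(\frac{2}{c} - 1\Bigr)\ln(2 - c) = H(c).
\end{align*}
```

The last line uses $1 - \frac{cx}{2} = 2 - c$ and $1 - x = \frac{2-c}{c}$ at $x = \frac{2(c-1)}{c}$.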



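From (33) with $\frac{\tilde\beta}{a} = \frac{2(c-1)}{c}$ we get $\max_{\beta > 0} g(\beta, \gamma_\beta) = -1 + a\, K_c\left(\frac{\tilde\beta}{a}\right) = -1 + aH(c)$.
Then, with (32) we obtain $\gamma_{\tilde\beta} = \frac{2a}{c}\ln\left(\frac{2a}{2a - \tilde\beta c}\right) = \frac{-2a\ln(2 - c)}{c} = \tilde\gamma$.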

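At last, observe that $\frac{\partial g}{\partial \beta}(\beta, \gamma) = \ln\left(\frac{2(e^{x_0} - 1)(a - \beta)}{\beta}\right)$, so $\tilde\beta$ and $\tilde\gamma$ give
the coordinates of the unique stationary point of $g$, that is the unique
solution of $\frac{\partial g}{\partial \beta}(\beta, \gamma) = \frac{\partial g}{\partial \gamma}(\beta, \gamma) = 0$. ∎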
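Claim 6.1 can be sanity-checked numerically. The following Python lines are a minimal sketch, assuming formula (26) for g, the implicit definition (28) of x0, and H(c) = ln c + (2/c - 1) ln(2 - c). The function names (`H`, `solve_x0`, `g`, `maximizer`) and the bisection solver are illustrative choices, not taken from the paper.

```python
import math


def H(c):
    """H(c) = ln c + (2/c - 1) ln(2 - c), for 1 < c < 2."""
    return math.log(c) + (2.0 / c - 1.0) * math.log(2.0 - c)


def solve_x0(beta, gamma, iterations=200):
    """Positive root x0 of 1 - exp(-x) = (beta/gamma) * x, for 0 < beta < gamma."""
    r = beta / gamma

    def f(x):
        return 1.0 - math.exp(-x) - r * x

    # f is concave, positive at its peak x = -ln(r), and negative at x = 1/r,
    # so the positive root lies in [-ln(r), 1/r]; plain bisection suffices.
    lo, hi = -math.log(r), 1.0 / r
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def g(a, c, beta, gamma):
    """g_{a,c}(beta, gamma) as in (26), for 0 < beta < min(a, gamma)."""
    x0 = solve_x0(beta, gamma)
    return (-1.0 + a * math.log(a) - (a - beta) * math.log(a - beta)
            + gamma * math.log(c * gamma / (2.0 * math.e * x0 * a))
            + beta * math.log(2.0 * (math.exp(x0) - 1.0) / beta))


def maximizer(a, c):
    """Candidate maximizer (beta~, gamma~) from Claim 6.1."""
    return 2.0 * a * (c - 1.0) / c, -2.0 * a * math.log(2.0 - c) / c


# Illustrative parameter values (hypothetical, chosen only for the check).
a, c = 2.0, 1.5
beta_t, gamma_t = maximizer(a, c)
print(g(a, c, beta_t, gamma_t), a * H(c) - 1.0)
```

For these sample values the two printed numbers should agree up to rounding, and evaluating `g` at nearby points should give smaller values, as strict concavity predicts.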



7. Conclusion. We have performed an extensive study of a natural
and expressive quantified problem, (1,2)-QSAT. We have proved the
existence of a sharp phase transition from satisfiability to unsatisfiability for
(1,2)-QCNF-formulas and we have given the exact location of the
threshold. The obtained results have several interesting features. The parameter
m, which is the number of universal variables, controls the worst-case
computational complexity of the problem (which ranges from linear-time
solvable to coNP-complete), as well as the typical behavior of random
instances. When m is small, there is a sharp threshold at c = 2. On the other
hand, when m is large enough, namely when $m \gg \ln n$, there is a sharp
threshold at c = 1: the analysis is similar to, and in fact easier than, what we
have done for pure snakes in Section 5, by considering snakes with strictly
distinct universal variables, as shown in [6]. This fact should be compared
to the fact that the threshold location $c^*(a)$ for $m = \lfloor a \ln n \rfloor$ goes to 1 when
a goes to infinity. More importantly, an original regime is observed when
$m = \lfloor a \ln n \rfloor$. Using counting arguments on pure bicycles, which are the seed
of unsatisfiability, and on pure snakes, which are special minimally false
formulas, we got respectively a lower and an upper bound for the threshold. It
turns out that these two bounds coincide, thus giving the exact location of
the threshold as a function of a.
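The dependence of the threshold on a can be illustrated numerically. The sketch below rests on two assumptions read off from the statements above rather than on an explicit formula in this section: for a ln 2 > 1, c*(a) is the unique solution in ]1,2[ of aH(c) = 1, with H(c) = ln c + (2/c - 1) ln(2 - c), and for a ln 2 <= 1 the threshold is 2. The function name `c_star` and the bisection tolerance are illustrative.

```python
import math


def H(c):
    """H(c) = ln c + (2/c - 1) ln(2 - c), for 1 < c < 2."""
    return math.log(c) + (2.0 / c - 1.0) * math.log(2.0 - c)


def c_star(a, tol=1e-12):
    """Assumed threshold: root of a*H(c) = 1 in ]1,2[ when a*ln(2) > 1, else 2."""
    if a * math.log(2.0) <= 1.0:
        return 2.0
    # H increases from 0 (at c = 1) to ln 2 (as c -> 2), since
    # H'(c) = -(2/c^2) ln(2 - c) > 0 on ]1,2[, so bisection applies.
    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if a * H(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Sample values of a (hypothetical, for illustration only).
for a in (1.0, 2.0, 5.0, 50.0):
    print(a, c_star(a))
```

Under these assumptions the printed values decrease with a and approach 1 for large a, in line with the remark on the limit of c*(a).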

A challenging question would be to determine the scaling window around 
c*(a) and get precise information on the typical contradictory cycles that 
occur in random formulas inside this window. 